Linear reduction method for predictive and informative tag SNP selection

نویسندگان

  • Jingwu He
  • Kelly Westbrooks
  • Alex Zelikovsky
چکیده

Constructing a complete human haplotype map is helpful when associating complex diseases with their related SNPs. Unfortunately, the number of SNPs is very large and it is costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNPs that should be sequenced to a small number of informative representatives called tag SNPs. In this paper, we propose a new linear algebra-based method for selecting and using tag SNPs. We measure the quality of our tag SNP selection algorithm by comparing actual SNPs with SNPs predicted from selected linearly independent tag SNPs. Our experiments show that for sufficiently long haplotypes, knowing only 0.4% of all SNPs the proposed linear reduction method predicts an unknown haplotype with the error rate below 2% based on 10% of the population.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MLR-tagging: informative SNP selection for unphased genotypes based on multiple linear regression

UNLABELLED The search for the association between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes has recently received great attention. For these studies, it is essential to use a small subset of informative SNPs accurately representing the rest of the SNPs. Informative SNP selection can achieve (1) considerable budget savings by genotyping only a limited number of SN...

متن کامل

Two Birds, One Stone: Selecting Functionally Informative Tag SNPs for Disease Association Studies

Selecting an informative subset of SNPs, generally referred to as tag SNPs, to genotype and analyze is considered to be an essential step toward effective disease association studies. However, while the selected informative tag SNPs may characterize the allele information of a target genomic region, they are not necessarily the ones directly associated with disease or with functional impairment...

متن کامل

Tag SNP Selection Based on Multivariate Linear Regression

The search for the association between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes has been recently received great attention. For these studies, it is essential to use a small subset of informative SNPs (tag SNPs) accurately representing the rest of the SNPs. Tagging can achieve budget savings by genotyping only a limited number of SNPs and computationally inferri...

متن کامل

A Novel Prediction Method for Tag SNP Selection using Genetic Algorithm based on KNN

Single nucleotide polymorphisms (SNPs) hold much promise as a basis for disease-gene association. However, research is limited by the cost of genotyping the tremendous number of SNPs. Therefore, it is important to identify a small subset of informative SNPs, the so-called tag SNPs. This subset consists of selected SNPs of the genotypes, and accurately represents the rest of the SNPs. Furthermor...

متن کامل

Tag SNP selection in genotype data for maximizing SNP prediction accuracy

MOTIVATION The search for genetic regions associated with complex diseases, such as cancer or Alzheimer's disease, is an important challenge that may lead to better diagnosis and treatment. The existence of millions of DNA variations, primarily single nucleotide polymorphisms (SNPs), may allow the fine dissection of such associations. However, studies seeking disease association are limited by ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • International journal of bioinformatics research and applications

دوره 1 3  شماره 

صفحات  -

تاریخ انتشار 2005